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Abstract 

In this paper, a novel dual estimation methodology is developed for both time-varying parameters 
and states of a nonlinear stochastic system based on the Recursive Prediction Error (RPE) concept and 
Particle Filtering (PF) scheme. Our developed methodology in based on a parallel implementation of state 
and parameter estimation filters as opposed to using a single filter for estimating the augmented states and 
parameters. The convergence and stability of our proposed dual estimation strategy are shown formally 
to be guaranteed under certain conditions. The proposed dual estimation framework is then utilized for 
addressing the challenging problem of fault diagnosis of nonlinear systems. The performance capabilities 
of our proposed fault diagnosis methodology is demonstrated and evaluated by its application to a gas 
turbine engine through accomplishing state and parameter estimation under simultaneous component 
fault scenarios. The health parameters of the system are considered to be slowly time-varying in the 
steady state condition of the engine operation. Extensive simulation results are provided to substantiate 
and justify the superiority of our proposed fault diagnosis methodology when compared to two other 
alternative diagnostic techniques that are available in the literature. 

I. INTRODUCTION 

State estimation of systems is a fundamental problem in control, signal processing, and fault 
diagnosis fields [1]. Investigations on both linear and nonlinear state estimation and filtering in 
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stochastic environments have been an active area of research during the past several decades. 
Linear state estimation methods use a simpler representation of an actual nonlinear system and 
can provide an acceptable performance locally around an operating point and in the steady 
state operational condition of the system. However, as nonlinearities in the system dynamics 
become dominant, performance of linear approaches deteriorates and linear algorithms will not 
necessarily converge to an accurate solution. Although an optimal state estimation solution for 
linear filtering methods exists, nonlinear filtering methods suffer from sub-optimal or near- 
optimal solutions. Consequently, investigation of nonlinear estimation and filtering problems 
remain a challenging research in the above fields. 

Numerous studies have been conducted in the literature to solve and analyze standard nonlinear 
filtering problems [2]-[8]. These methods can be broadly categorized into [7]: (a) lineariza- 
tion methods (extended Kalman filters (EKF)) [2], (b) approximation methods using finite- 
dimensional nonlinear filters [3], (c) particle filter (PF) methods as one of the most popular 
Bayesian recursive methods for state estimation [4], (d) classical partial differential equation 
(PDE) methods for approximating a solution to the Zakai equation [5], (e) Wiener chaos ex- 
pansion methods [6], (f) moment methods [7], and (g) high dimensional nonlinear Kalman filter 
methods known as the Cubature Kalman filters [8]. 

Estimation of states and parameters of nonlinear systems have been tackled by using extended 
Kalman filtering (EKF) methods in [9], where the accuracy of parameter estimates is only 
validated through experimental results. An off-line parameter estimation method was developed 
in [10], in which the effects of each parameter on the system measurements were quantified 
by using the principal component analysis (PCA). However, this method is inherently off-line 
and cannot be used to track on-line variations of the system parameters. Dual state/parameter 
estimation methods have also been developed for nonlinear systems based on (i) EKF in [11], 
(ii) unscented Kalman filters (UKF) in [12] for slowly-varying parameters, and (iii) ensemble 
Kalman filters in [13] for static (fixed or constant) parameters. 

While dual estimation methods based on UKF achieve more accurate results as compared to the 
EKF-based dual methods, both of these methods are applicable to systems that are considered 
to be affected by the process and measurement noise with Gaussian distributions. Therefore, 
particle filter-based dual state/parameter estimation methods are introduced as powerful tools for 
nonlinear systems that are affected by non-Gaussian process and measurement noise [4], [14]. 
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In cases where the fixed or constant system parameters are unknown and need to be estimated 
from either on-line or off-line approaches, two main classes of estimation methods have been 
proposed in the literature based on the numerical particle filtering (PF) and the maximum 
likelihood filtering [15]. These methods are mainly developed to only estimate fixed (constant) 
parameters of a system. A combination of particle filtering and gradient algorithm based on the 
stochastic approximation technique was utilized in [16] to develop an adaptive fixed parameter 
estimation method. The efficiency of this method was assessed to be more reliable than general 
particle filtering methods that are developed for parameter estimation problems. 

In the framework of particle filters, a general method that is capable of simultaneously 
estimating the static (constant or fixed) parameters and the time-varying states of a system 
is developed in [17]. This work is based on the sequential Monte Carlo (SMC) method in which 
an artificial dynamic evolution model is considered for the unknown model parameters. In order 
to overcome the degeneracy concerns due to the particle filtering, kernel smoothing technique 
[18] as a method for smoothing the approximation of the parameters conditional density was 
invoked. The estimation algorithm was further improved by re-interpretation of the artificial 
evolution algorithm according to the shrinkage scaling concept. However, the proposed method 
in [18] is only applicable for estimating fixed parameters of the system and they used the 
augmented state/parameter vector for the purpose of system estimation. 

In the present paper, we propose to extend the particle filtering approach for state estimation 
by utilizing the recursive prediction error (RPE) scheme to estimate the slowly time-varying 
parameters in a dual state/parameter estimation filter as opposed to an augmented approach 
that was introduced in [17]. The RPE method is used for both off-line and on-line system 
identification by utilizing a quadratically convergent scheme [19], [20]. The concept of the 
kernel mean shrinkage [18] is also used in our proposed scheme to avoid overdispersion in the 
variance of the estimated parameters. 

Towards the above end, the main contribution of this paper is to utilize nonlinear Bayesian 
and Sequential Monte Carlo (SMC) methods to develop a unified framework for both the state 
and parameter estimation problems. Our methodology is based on solving the Bayesian recursive 
relations by using SMC methods. An on-line parameter estimation scheme is developed based 
on the recursive prediction error (RPE) method [19] by using the particle filter (PF) approach. 
Specifically, by using the prediction error to correct for the changes in the system parameters, 
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a novel method is proposed for parameter estimation of nonlinear systems based on the PR In 
the implementation of the algorithm, a dual structure is proposed for both state and parameter 
estimation in the PF framework. The hidden states, and the variations in the system parameters 
are estimated through two parallel filters. 

The other equally important problem that is considered in this paper deals with fault detection 
and isolation (FDI) of nonlinear systems, also known as the fault diagnosis (FD) problem. 
Proper utilization of FDI systems can improve the reliability and operational safety of industrial 
systems. The FDI techniques of interest in this work use the concept of analytical redundancy 
[21], thereby not requiring any additional and redundant hardware. However, these analytical 
methods are more challenging to develop due to requiring properties such as robustness against 
noise, model uncertainties and unknown disturbances [22], [23]. 

State estimation methods have been applied for development of model-based fault diagnosis 
(FD) and health monitoring techniques in recent years [24]. In these methods, accuracy of 
the considered dynamical model plays an important and crucial role in the performance of the 
developed FD scheme. Therefore, methods that are designed by explicitly taking into account the 
nonlinear dynamical characteristics of the system are considered to yield more accurate solutions 
to the fault diagnosis problem [25]. 

Fault detection and isolation is achieved by using the concatenated wavelet transform variances 
and discriminate analysis in [26]. This method was proposed as an alternative to the multiple- 
model approach used for FDI. It was shown that it can improve the overall performance in fault 
classification of mechanical systems, however it is applicable to systems that are affected by 
the stationary noise processes. A fault detection and diagnosis scheme (FDD) for continuous- 
time stochastic dynamic systems with time delays and non-Gaussian variables is proposed in 
[27] based on an observer-based optimal FDD using the conditional probability distributions. 
However, in this method it is assumed that the output pdfs can be approximated by using the 
square root B-spline models. 

Component fault diagnosis of dynamical systems can be achieved through dual state/parameter 
estimation methods [24], [28] by first detecting the faulty competent and then by determining 
its severity and isolating its location. In this work our proposed dual state and parameter 
estimation strategy is applied for component fault diagnosis of a gas turbine engine based on 
the nonlinear gas path analysis approach [29]. The component fault diagnosis in gas turbine 
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engines is addressed by utilizing a multiple-model approach in [29], [30]. Whereas the multiple- 
model approach can detect and isolate faults based on the predefined faulty models, utilization 
of the parameter estimation scheme in our developed dual state/parameter estimation enables 
the diagnostic algorithm to detect, isolate and accurately identify the type and severity of 
simultaneous fault scenarios in the gas turbine system. 

The remainder of this paper is organized as follows. In Section II, the statement of the nonlinear 
filtering problem is introduced. Our main proposed dual state/parameter estimation scheme is 
developed in Section III, in which state and parameter estimation methods are first developed 
in parallel and separately and subsequently integrated together for simultaneously estimating the 
system state/parameters. The stability and convergence properties of the proposed scheme under 
certain conditions are also provided in Section III. In Section IV, extensive simulation results 
and case studies are provided to demonstrate and justify the merits of our proposed method for 
fault diagnosis of a gas turbine engine under simultaneous component faults. Finally, the paper 
is concluded in Section V. 

II. PROBLEM STATEMENT 
The problem under consideration is to obtain an optimal estimate of the states as well as the 
time-varying parameters of a nonlinear system whose dynamics is governed by a discrete-time 
stochastic model given by, 

x t+1 = f t (x t ,9 t ,ujt), (1) 

y t = h t (x t ,9 t ) + is t , (2) 

where f t : W 1 * x W 10 x IR n " — > M. nx is a known nonlinear function that specifies the state at 
the next time step t e N, 6 t E W 18 is the unknown and possibly time-varying parameter vector 
at time t that is governed by an unknown dynamics. The function h t : W 1 * x W le — > M. Uy is a 
known nonlinear function representing the map between the states, parameters and the system 
measurements at time t, oo t and v t are uncorrelated stochastic process and measurement noise 
sequences with covariance matrices Q t and R t , respectively. 

Our main goal in the dual state and parameter estimation problem is to approximate the 
following conditional expectations: 
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1 (x t )\v»M = J* 1 (Mv»M***, (3a) 

HMOt)\vi*,x t ) = J M0t)p(6 t \ yi *,x t )de t , (3b) 

where y 1:t = (yi, 2/2, ■■■■,Ut) denotes the available observations up to time t, and 4>i '■ ^ nx — > K, and 
02 : — > K are functions of states and parameters, respectively, that are to be estimated. The 
conditional probability functions p(x t \yi-.t, 9 t -i)dx t and p(O t \yi :t , x t )d8 t are to be approximated 
by the designed particle filters (PFs) through determining the filtering distributions according to 



N 



PN(x t \yi:t,9t-i)dx t = ^2w®6 x w(dx t ), 



PM(9t\yi:t,x t )d9 t = y^w^S 0 u)(d6t), 

3=1 

where the subscripts N and M imply that the state and parameter conditional probability 

(i) 

distributions are obtained from iV and M particles, respectively. Each particle x t has a weight 
Wx} and each particle 9^ has a related weight w# t \ where 5(.) denotes the Dirac-delta function 
mass that is positioned at x t or 9 t . 

Based on the approximation that is used in the representations in equation (4), our goal is 
to address the convergence properties of the to be designed estimators to their true optimal 
estimates and also to develop and demonstrate under what conditions this convergence remains 
valid. Subsequently in Section IV, the models (1) and (2) are used to address the problem of 
fault diagnosis of a dynamical nonlinear system when the system parameters are considered to 
be affected by a multiplicative fault vector parameter. 

III. Proposed Dual State/Parameter Estimation Framework 

In this section, the main theoretical framework of our proposed dual state/parameter filtering 
of the nonlinear system (1) and (2) is introduced and developed. 

A. Dynamic Model in Presence of Time-Varying Parameters 

Our first task is to represent the model (1) and (2) into a different framework for our subsequent 
theoretical developments. Let (f^J 7 , P) denote the probability space on which the three real 
vector-valued stochastic processes, X = {X t ,t = 1,2, ...},6 = {6 t ,t = 1,2,...} and Y = 
{Y t ,t = 1,2,...} are defined. The n x -dimensional process X describes the evolution of the hidden 



September 30, 2014 



DRAFT 



7 



states, the n e -dimensional process 0 describes the evolution of the system hidden parameters 

which are conditionally independent of the states, and the ^-dimensional process Y denotes the 

observation process of the system. 

The processes X and 0 are Markov processes with the associated initial state and initial 
parameter X 0 and 0 O , respectively. They are drawn from the initial distributions n XQ (dx 0 ) and 
7%(d#o)> respectively. The dynamic evolution of the states and parameters are modeled by the 
Markov transition kernels K x (dx t \x t -i, 6t-i) and Kg(d9 t \9 t -i, x t ) that also admit densities with 
respect to the Lebesgue measure x , such that 

P(X t eA 1 \X t _ 1 =x t - 1 ,e t - 1 =6 t - 1 )= / K x (xt\x t -i,e t -i)dxt, (5) 

p{e t eA 2 \e t - 1 = e t - 1 ,x t = x t )= [ K e {e t \6 t - U x t )&e u (6) 

for all A x G B(W X ) and A 2 G B(W 10 ), where B(R n *) and B(R n °) denote the Borel ^-algebra on 
R rix and R ne , respectively. The transition kernel K x (x t \x t _i,9 t _ 1 ) is a probability distribution 
function (pdf) that follows the pdf of the stochastic process in equation (1). The probability 
density function for approximating the parameter kernel transition K(9 t \9 t -i, x t ) is to be provided 
in the next subsequent subsections. 

Given the states and parameters, the observations Y t are conditionally independent and have 
the marginal distribution with a density with respect to the Lebesgue measure as given by, 

P(Y t eB\X t = x t ,Q t = 9 t ) = f B p(y t \x t ,9 t )dy t , (7) 

where p(yt\x t , 9 t ) is a probability density function which follows the probability density function 
of the stochastic process in equation (2). 

In the dual state/parameter estimation framework at first the state x t is estimated (which is 
denoted by x t \t) and this estimated value at time t is then used to estimate the parameter value 9 t 
at time t (which is denoted by 9 t \ t ). In many practical situations, the evolution of the parameters 
are not specified a priori, therefore it is necessary to consider a given evolution for the parameters 
in order to design the estimation filter. In our proposed dual structure for the state estimation 
filter, first the parameters are assumed to be constant at t — 1 at their estimated value 9 t _i\ t _i, 
and then for the parameter estimation filter they are evolved to the next time instant by applying 

The transition kernel K(dxt\x t -i) admits density with respect to the Lebesgue measure if one can write P(Xt £ dxt\Xt-i = 
x t -i) = K(dx t \x t - 1 ) = K(x t \x t -i)dx t . 
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an update law based on the recursive prediction error (RPE) method. The details regarding our 
proposed methodology are presented in the subsequent subsections in which the filtering of the 
states and parameters are described and developed. 

B. The State Estimation Problem 

For designing both the state and parameter estimation filters, our main goals are to approximate 
the integrals in equations (5) and (6) by invoking the particle filter scheme and to approxi- 
mate the estimate of the conditional state and parameter distributions. Consider 7r X(|tl (da; t ) = 
J M n x 7Tx t _i| t _i(d2;t-i)-^x(da; t |a; t _i, 9 t -i) as the a priori state estimation distribution before the 
observation at time t becomes available, and 7r0 t _ 1|(1 (d0 t _i) as the conditional distribution of 
the parameter at time t — 1. The a posteriori state distribution after the observation at the time 
instant t becomes available is obtained according to the following rule, 

7r X(|t (da; t ) oc p(y t \x ti e t ^ XAt _ 1 (dx t )ne t _ 1]t _ 1 {de t ^ 1 ). (8) 

For our to be designed state estimation filter the parameters are frozen and fixed to their 
estimated values at the previous time step, i.e. Q t _i\ t _i, hence it is assumed that 0 t _i| t _i is 
known for this filter. Therefore, the last term in the right hand side of equation (8) is set to one. 

The particle filter (PF) procedure for implementation of the state estimation and for determin- 
ing n x (dx t ) consists of two main steps, namely (a) the prediction step, and (b) the measurement 
update step (time update step). Consider one is given iV particles at time t. The prediction step 
utilizes the knowledge of the previous distribution of the states as well as the previous parameter 
estimate, that are denoted by % = 1,...,N} (corresponding to N estimated state 

particles that follow the distribution 7r Xt _ 1|t _ 1 (dx t _i)) and respectively, and the process 

model as given by equation (1). In other words, the prediction step is explicitly governed by the 
following equations for % — 1, N, as 

'ViA-Ht-i,^) (9a) 
jlp^-Ht-i), (9b) 

1 N (i) (i) 1 N (i) T 

i=i i=i 

(i) (i) 

where uo t denotes the process noise related to each particle x t ^_ 1 and is drawn from the noise 
distribution with the probability distribution function p Ut (.), and denotes the independent 
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samples generated from equation (9a) (with the transition kernel as given by K x {.)) through 
the N particles. The initial covariance of the noise term is chosen less than or equal to the 

(i) 

one in equation (1). Moreover, y^d-^ denotes the independent samples of the predicted outputs 
evaluated at samples, and S £t|t _ 1 denotes the a priori state estimation covariance matrix. 

For the first step, the one- step ahead prediction distribution known as the a priori state 
estimation distribution is now given by, 

i=i 

For the second step, the information on the present observation y t is used. This results in 
approximating 7r Xt|t (da; t ), where 9 t -\\t-i is considered as given from a parameter estimation filter 
and obtained from the distribution 7r^_ i ti (d0 t _i), where M > N is the number of particles in 
the parameter estimation filter (details are given in the Subsection III-C), and N is the minimum 
number of the required particles for the state estimator to converge. Specifically, according to 
the Bayes' rule, we have the approximation: 

7r; (dx t ) oc - r f rn — a \-n ^ ^ m 7j~n — \, since the parameter estimate at t — 1, i.e. 

Xt \t y lJ hn x J R n g p{yt\xtA-l)n£ tlt _ 1 {dxt)n™_ l ^_ l (d6t- 1 ) , f 

0*-i|t-i is assumed to be known and approximated from the mean of the a posteriori parameter 

p(f*l £ t'Li'^-i|*-i) 5 a (0 ( dxt "> 



estimate distribution at t — 1. Consequently, cc— — ^ m — . 

t|J n T,i=iP(yt\x t \ t _ 1 A-i\t-i) 

Remark 1. In order for the optimal state filter to exist, as per existence of the probability 
distribution Junction in [31], one requires that -^^2^ =1 p(yt\x^j._ v 6t-i\t-i) > 0. Hence, in the 
state estimation filter of the dual structure one must ensure that ^ X)i=i p{Vt\^t\t-i^ ^t-i|t-i) — 
r\ h , r\ h denotes a threshold that is selected for sufficiently large N and (9 t _i| t _i is the mean 
of the a posteriori distribution of 6 ( f] 1 ^ 1 denoted by 7r^_ i|t i (d# t _i) and approximated by the 
j = 1, M particles. 

It follows that for the a priori state estimation algorithm (9a)-(9c) to proceed, the condition 
in Remark 1 must be satisfied, otherwise a new particle population has to be generated and 
this condition is checked again [31]. A suitable initial choice for r\ h can be taken as the first 
standard deviation of the acceptable normalized error in the output measurement (the signal y t in 
equation (2)) in the steady state mode of the system operation. The condition stated in Remark 
1 is invoked to reduce the risk of the filter divergence according to [31]. 

(i) 

The particle weights w x ; are updated by the likelihood function (the importance function) 
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according to w$ ~ p Ut (y t - y^) = p{y t \x^ t _ v 0 t -i\t-i), where p Ut (.) denotes the probability 
distribution function of the additive noise of the output and is evaluated at y t — r Next, 
a resampling is performed to generate iV new particles as with replacements from 

{£t|t_i}£Li> where the probability for taking the sample k is P(x||]_ 1 = = wif = 

— ^ Vt]{ * |t ~ ( 1 fc'^ t ~ 1 } t ~^ — , and where w x k ) for k — 1, ...,N denotes the normalized particle weights. 

E k =iP(yt\^ t] t-v d t-i\t-i) 

In other words, the particle selection in the resampling step is performed for the particles with 
higher probabilities of p Vt (y t - y^). 

Although the total number of the particles after the resampling is the same as that before 
the resampling, certain particles for which the weight w x 1 } is greater than a threshold will be 
applied to their neighbors with lower weight values. After the resampling all the new particles 
are assigned equal weights as There are different methods for performing the resampling 
[4]. In this work, we have chosen the regularized particle filters (RPF) since they are capable 
of transforming the discrete-time approximation of the a posteriori state estimation distribution 
(dx t ) into a continuous-time one. Consequently, the resampling step is modified in such a 
manner so that the new resampled particles are obtained from an absolutely continuous-time 
distribution with N different locations x^l-^ from that of [32]. Therefore, the a posteriori 
state estimation distribution is approximated by 7r^ (dx t ) before performing the resampling by 
using the reguralized particle filter (RPF) structure [32], and by 7r^ (dx t ) after performing the 



resampling as provided below, 



Nreg N 



1=1 i=l 



£i=iP(2/tRlt_i>0f-i|t-i) 



N , N 



i=i i=i 



where x[ e9 ', Z = 1, ..,N reg denotes the regularized state vector evaluated at N reg points which 
are obtained from the absolutely continuous-time distribution of the particles between the first 
standard deviation of the maximum value of the sampled particles x^l-y and their minimum 
value that is divided into N reg desired regularized points. Hence, {x^ tl }f =l is obtained from 
the continuous-time distribution through the regularization kernel /C which is considered to be a 
symmetric density function on W ax [32]. The matrix A is chosen to yield a unit covariance value 
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in the new x^' t _ 1 population and AA T = E £t|t _ 1 . The constant b denotes the optimal bandwidth 
of the kernel, and x t \ t denotes the a posteriori state estimation at time t. 

We are now in a position to introduce our overall particle filter (PF) scheme for implementing 
the state estimation filter. Our goal for proposing this algorithm is to ensure that an approximation 

to E((f)(x t )\yi: t , 9 t -i) by <f>(x t ) = x t takes x t \ t ~ ir£ t]t (dx t ) = jj E*=i S £ w(dx t ), where 7r^ t (dx t ) 
denotes the a posteriori distribution of (after the resampling from as 

given by x t \t = ^ Zlili ^t\t-r The estimated output from the state estimation filter is also given 

by y t = ht(x t \ u 9 t -i\t-i)- 

The State Estimation Particle Filter Scheme 

1) Initialize the N particles, {x^}^ =1 ~ 7r xo (dxo) and parameters with 9 0 (the mean of the 
parameter initial distribution 7r0 o (d0 o ))- 

2) Draw ~ Pu t (-)> where p Wt (.) denotes a given distribution for the process noise in the 

(i) 

filter, and then predict the state particles x^j__ 1 according to equation (9a). 

3) If the condition in Remark 1 holds, that is YliLi p(yt\x^._ v 6t-i\t-i) > r\ h go to Step 4, 
otherwise return to Step 2. 

4) Compute : from equation (9b) to obtain the importance weights {w x l ]}f =l as w x 1 } = 
p(yt|^[ t _i,^-i|t-i), i = 1, AT, and normalize them to wil = ™ xt w ■ 

5) Resampling: Draw iV new particles with the replacement for each i = 1,...,N, according 
to Pix^i = Xqt-i) = Wxt i k = 1) •••) AT, from the regularized kernel /C where ~ 
7f^ |t (dx t ) as given by equation (11). 

6) Calculate x t \t from the conditional distribution given by equation (11), 

<i t ( da; t) = h £2=1 ^w,^^*) with e q uall y weighted xj t ) _ 1 as % = £ Eii^-r 

7) Update the parameters from the parameter estimation filter (to be specified in the next 
subsection). 

8) Set t := £ + 1 and go to Step 2. 

Following the implementation of the above state estimation filter, the parameter estimation 
filter that is utilized for adjusting the parameters is now described in detail. 

C. The Parameter Estimation Problem 

One of the main contributions of this paper is to develop a novel PF-based parameter estimation 
filter within our proposed dual state/parameter estimation framework by utilizing the recursive 
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prediction error (RPE) concept. For this part of the methodology, it is assumed that the parameters 
are not known a priori and are also time-varying. Moreover, the estimated states obtained from 
the state estimation filter provided in the previous subsection will be used. In other words, in 
the parameter estimation filter the estimated states are used as x t ~ x t = x t \ t . Therefore, it is 
imperative that one considers a dynamical model corresponding to the parameters evolution in 
order to estimate the density function 7T0 t|t (d0 t ). 

The most common dynamical model that is considered for parameter propagation (in case 
of fixed system parameters) is the conventional artificial evolution law in which small random 
disturbances are added to the state particles (parameters) between the time steps [17]. However, 
in the present work, the conventional artificial evolution update law for the parameters is modified 
to include the output prediction error as a term allowing one to deal with the time variations in 
the parameters that can affect the system output. 

In order to derive an update law for the parameters, an algorithm based on the recursive 
prediction error (RPE) method is proposed by minimizing the expectation of a quadratic function 
J(9 t -\) with respect to 6 t -\. The considered cost function is to be minimized with respect to 
9 t -i, since our parameter estimation algorithm for finding the distribution of the a posteri- 
ori parameter estimate is based on the kernel smoothing using the shrinkage of the particle 
locations. This method attempts to force the particles towards their mean value in the pre- 
vious time step, i.e. the estimated value of 9 t _i that is denoted by O t -\\t-i (before adding 
noise to the particles), which is also used in the state estimation filter for approximating x t \ t . 
Therefore, our goal is to investigate the convergence properties of Q t -\\t-\ whose bounded- 
ness ensures the boundedness of 9 t \ t . Towards this end, the cost function is now selected as 
E(J(9 t _ 1 )\y 1:t _ 1 ,x t ) = J J(9 t _ 1 )p(9 t _ 1 \y ht _ 1 , x t )d9 t _ 1 , where the integral is approximated in 
the PF by E(J(9 t ^)\y 1:t ^x t ) « £ 

The function J(^ 1 | t _ 1 ) now represents a quadratic function of the output prediction error 
related to each particle j, j = 1, M. The prediction error is now defined by e(f , O®^) = 
e t^ — Vt — ht(x t \t, ^H.\| t _i)> an d ^\| t _! denotes the particle related to the estimated value of 
the parameter whose true value is denoted by 9*_ x and which is clearly assumed to be unknown. 
Therefore, we define J(^ 1 i t _ 1 ) = \ E(/(e(r, 9^\ t _^))) in which the expectation is 

taken over the observation sequence of k samples. Let us now select the quadratic criterion 

J(e(f, aS 
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The following modified artificial evolution law is now proposed for the parameter update in 
the particle filter for generating j = 1, M parameter particles that correspondingly determines 
the distribution from which the a priori parameter estimate is obtained as follows, 

m? = + ltRl j W<t, (13b) 

0<$_ x = A t mf + (/ - A t )m t _! + C t °' } , (13c) 
1 M 

ytt-x = h{Sk\J^\ d3d) 

where denotes the particles obtained from the modified RPE-based artificial evolution 

law (13a), i\) t = -M = ^(^itA-nt-i) ^ wWch when evaluated at i s denoted by 

ot>t-i\t-i o<>t-i\t-i 1 -"-I 1 1 

ipt*\ It is a time-varying step size design parameter, ~ A/"(0, (/ — A$)Vq ) denotes the 
zero-mean normal increment particles to the parameter update law at each time step with the 
covariance matrix (I — Aj)Vg i through the use of the kernel smoothing concept, A t denotes the 
shrinkage matrix and V§ denotes the covariance of the parameter estimation in the previous 
time step t — 1. The kernel shrinkage algorithm attempts to force the distribution of the parameters 
particles towards the mean of their distribution in the previous time instant as denoted by rh t _i 
by applying the shrinkage coefficient matrix A t to the obtained mf \ The processes d'f-m-i an( * 
are conditionally independent given observations up to time t. Moreover, = E[e^e^ T ] 
denotes a time-varying coefficient which determines the updating direction and is selected as 
a positive scalar to ensure that the criterion (12) can be minimized by changing 9^}^ in the 
steepest descent direction. The a priori parameter estimation particle is denoted by 9^_ v The 
measurement equation (13d), is used to distinguish the evaluated output obtained by the 

parameter estimation filter from the one obtained by the state estimation filter, as provided in 
the Subsection III-B. The equations (13a)-(13d) are to be used for implementing the parameter 
estimation particle filter scheme. 

In Theorem 1 that is provided in the Subsection III-E, it will be shown that the update law 
(13a) can guarantee the convergence of the parameter estimate particles 9 ( t j } 1 u 1 , j = 1, ...,M 
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(after the resampling step), to a local minimum of E(J(^ 1 | t _ 1 )|?/ 1:t _ 1 , x t ) which is located in 
a compact set of {x t ,9 t } for which the functions f t (x t ,9 t ,uJt), and h t (x t ,9 t ) are sufficiently 
smooth with respect to x t and 9 t , so that a small change in x t or 9 t cannot result in a big change 
in their estimates. This set is denoted by D M . 

The parameter update law (13a) is known as the RPE-based modified artificial evolution law. 
This update law contains a term in addition to the independent normal increment $\ The 
estimated parameter from this update law is invoked in the PF-based parameter estimation filter 
to represent the distribution from which the parameter particle population for the next time step 
is chosen. Therefore, the above proposed RPE-based modified artificial evolution law enables the 
PF-based estimation algorithm to handle and cope with the time-varying parameter scenarios. The 
time-varying term 7ti2p'\ acts as an adaptive step size in equation (13a), therefore our algorithm 
can also be considered as an adaptive step size scheme. 

In order to ensure that the obtained from the modified artificial evolution law given by 
equation (13a) remains in D M , the following projection algorithm is utilized that forces 9^-i 
to remain inside the compact subset D of D M (D C D M ) according to the following procedure 
[19], 

1) Choose a factor 0 < ji < 1, 

2) Compute 9^_, := 7t i#' (y t ~ h^J^^)), 

3) Construct ^ 1 :=^ ) 1|t _ 1 + ^ ) _ 1 , 

4) If G D M go to Step 6 else go to Step 5, 

5) Set 9$! = and go to Step 3, 

6) Stop. 

Consequently, the one-step ahead prediction distribution of the parameter 9 t is now obtained as, 

M 



tW = ^fe W ' (14) 



7T, 



'I 



In the measurement update step, the approximation to n e (d9 t ) that is denoted by jtgf (d9 t ), 

is given according to the Bayes rule as (d9 t ) oc - r - r 1—, — — , ,„ , N r . Since 

the state estimate at time t, i.e. x t \t is assumed to be known and approximated as the mean of 
the a posteriori state estimate distribution 7r^ (dx t ) (from a filter with N particles), therefore, 

ZrM ( A Q \ „ "t\t-l 



m 2^j=i p(yt\ x t\t>v t \t-i) 
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Remark 2. In order for the optimal parameter filter to exist, in the approximation of iTQ t ^(d9 t ) 
one must ensure that YljLi P(yt\%t\u @t\t-i) — r th> where r\ h denotes a threshold that is selected 
for sufficiently large M. 

(i) 

Consequently, as the present observation y t becomes available, the particle weights Wg are 
updated by the likelihood function according to ~ VvXVt — V^t-i) = P(Ut\^t\t, wnere 
p Vt (.) denotes the pdf of the measurement noise. This can now be expressed by using the 
normalized weights wP as Tff (d9 t ) = E^t w^6 sU ) (d9 t ), where = . 
Following the resampling/selection step, an equally weighted particle distribution ir^ t (d9 t ) is 

obtained as n^f (d9 t ) = h Y^f=i ^gU) (d#t) f° r approximating n e (d9 t ), and the resampled 

* I * ■> t\t—\ ' " 

(selected) particles that are denoted by follow the distribution 7t^ t (d9 t ). It must be noted 
that in our proposed parameter estimation filter the state is taken as the available estimated 
x t \t- Therefore, the a posteriori parameter estimation distribution is approximated by a weighted 
sum of Dirac-delta masses as 7f^(d# t ) before performing the resampling and with an equally 
weighted particle distribution approximation as ir^ t (d9 t ) according to 

M 

rM ( A Q \ ~ V^.T.CJ). 



t|t - 1 

-0") a PiMlxtiJ^) 

E,=iP(ytl^it^t| t -i) 

1 M 1 M 

j=l j"=l 
where denotes the normalized parameter particle weight, is obtained from the 

resampling/selection step of the scheme by duplicating the particles having large weights 
and discarding the ones with small values to emphasize on the zones with higher a posteriori 
probabilities according to P(9 ( t ^_ 1 = ^t-i) = ' ^ = ^->-—>M. In our proposed filter the 
residual resampling method is used to ensure that the variance reduction among the resampled 
particles is guaranteed [33]. 

Therefore, an approximation to E(0(0 t )|yi :t , x t ) by 4>(9 t ) = 9 t takes on the form 9 t \ t ~ 
7rF (d9 t ) = tjEci^u) (d9 t ), where ixf- (d9 t ) denotes the a posteriori distribution of the 

t\t ivl J y t i t _i 'I* 

parameter estimate (after performing the resampling from 9^_ ± ). The resulting estimated output 
of this filter is obtained by y t = h t (x t \ t ,9 t \ t ). The explicit details for implementation of the 
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parameter estimation filter are now provided below. 

The Parameter Estimation Filter 

The particle filter for implementation of the parameter estimation is described as follows: 

1) Initialize the M particles for the parameters as {9 J 0 }jLi ~ 7T0 o (d0 o ), and use the initial values 
of the states as x 0 which represents the mean of the states initial distribution Tc Xo (dx 0 ). 

2) DrawCF^(0,(/-A t 2 )^_ i|t J. 

3) Predict §® v j = 1,...,M from equations (13a)-(13c). 

4) If jj Sjli p{Ut\xt\t, Qt(t-i) ^ r th (according to Remark 2) go to Step 5, otherwise return to 
Step 2. 

5) Compute the importance weights {w^y^L^w^ 1 = p(yt\xt\t,^l\t-i)ij — 1,—,M, and 



,0')\m „„0') _ u , qU) 

t\t 

normalize them to w a 



T M w (j) ' 

6) Resampling: Draw M new particles with replacement for each j = 1,...,M, P(9^l_ 1 = 
^i) = * = 1, -,M, where ~ ^(dfl,) = E^tfg^Jdfc). 

7) Construct <L t from the conditional distribution 7rF (d0 t ) = ^ E?li ^0) (d#t) with equally 

* I* -' tit — i 



weighted 0^ as ^ = £ fl^ 



) 

t \t-l aB ^ — M ^j=l u t\t 

8) Set t — t + 1 and go to Step 2 of the state estimation filter as provided in the Subsection 

m-B. 

As stated earlier the kernel from which the parameter particles i.e. for the next time step 
is chosen is a Gaussian kernel and its mean is obtained from and its variance is obtained 
based on the kernel smoothing consideration that is provided in the next Subsection III-D. In 
the subsections below the required conditions for the boundedness of the parameter transition 
kernel K e (.) are also investigated and developed. 

D. Kernel Smoothing of the Parameters 

In this subsection, the kernel smoothing approach [18] is utilized to ensure that the variance 
of the normal distribution which is obtained based on the modified artificial evolution law for 
the parameter estimates remains bounded. 

Consider the modified artificial evolution law (13a) in which ffl is a normal zero-mean 
uncorrelated random increment to the parameter that is estimated at time t — 1. As t — > oo, the 
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variance of the added evolution increases and can therefore make # t ^_i m the modified RPE- 
based artificial evolution (13a), and consequently in (13c) become completely unreliable. 
This phenomenon is known as the loss of information that can also occur between two consequent 
sampling times. On the other hand, since 9 t \ t is time-varying, generally there will not exist an 
optimal value for the variance of the evolution noise ffl that remains suitable for all times. 

Consequently, the idea of the kernel shrinkage has been proposed in [18]. In the kernel 
shrinkage approach [17], for the next time step one takes the mean of the estimated parameter 
distribution in the particle filter according to 

( 16 ) 

= M{A t m? + (/ - A t )m t _i, (/ - A 2 t )V §t _ iit J, 

(i) 

where m t for j = 1,...,M, is obtained from (13b). By utilizing this kernel shrinkage rule, 
the resulting normal distribution retains the mean m t _i and has the appropriate variance for 
avoiding over-dispersion relative to the a posteriori sample. The kernel shrinkage forces the 
parameter samples towards their mean before the noise ffl is added. In our proposed approach 
the changes due to the parameter variations are considered in the mean of the parameter estimate 
distribution through the modified artificial evolution rule, therefore the mean of the distribution, 
i.e. fht-i, itself is time- varying and the kernel shrinkage ensures a smooth transition in the 
estimated parameters even when they are subjected to changes. To eliminate the information 
loss effect, the variance taken from both sides of equation (13c) results in V# i = AjV^ i + 
(I — Af)Vx =Vn , that ensures the variance of the added random evolution would not 
cause over-dispersion in the parameter estimation algorithm for all time. 

The following proposition specifies an upper bound on the shrinkage factor. This upper bound 
is calculated in the worst case, that is when the parameter is considered to be constant but 
the modified evolution law (13a) is used in the parameter estimation filter for estimating it. 
Utilization of this upper bound in the kernel shrinkage algorithm ensures the boundedness of the 
variance of the estimated parameters distribution that is obtained according to the RPE-based 
artificial evolution update law and the kernel smoothing augmented with the shrinkage factor. 
Proposition 1: Upper bound on the kernel shrinkage factor. Given the parameter update law 
(13a), the estimated parameters conditional normal distribution based on the kernel smoothing 
as given by equation (16), result in an upper bound for A t that is obtained as A t < 1(1 — 
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[I . Jii^ 1 —^2 — .t-.Ii 11 ' 1 J ), where ip/' in equation (13a) is considered as constant 



t-1 t-1 "t-llt-l 



between the time steps t and t — 1 and is denoted by ^. Moreover, a min and a max denote the 
minimum and the maximum eigenvalues of a matrix, respectively, W t denotes the variance of the 
added noise, V y denotes the variance of the measurement noise, and P tmax = max^T^}^. 
Proof: Let us consider the modified artificial evolution equation (13a). Let V(.\yi :t ) denote 
the variance of a given stochastic process assuming that the observations up to time t are 
available, and C(., -\yi;t) denote the covariance of two given stochastic processes by assuming 
that the observations up to time t are available. In our method the variations in the time-varying 
parameters are captured through the prediction error term e\ ' and then used in the modified 
artificial evolution model (13a) in order to determine the mean of the normal distribution from 
which the parameter population for the next time step is chosen. Our main concern here relates 
to the case when the actual system parameter is constant but the RPE term P^ ip^ e[ j \ and the 
evolution noise are still added to the parameter estimate rule. In this case, it is necessary to 
introduce correlations between the predicted parameter an d the random noise 

By taking into account the relationship between the variance of both sides of equation (13a) 
when the covariance matrix is assumed to be non-singular, and by assuming that {ip^jfi-^ = ^ 
is constant between t - 1 and t, yields V^JJ.Jj/i*) = V(0 t ( i ) 1 | t _ 1 |yi :t ) + P^W^ + W t + 
SC^Vi'^l^) +2C(P}H4 j \(i j) \y 1:t ) +2C(6l%_ 1} P}He?\y 1:t ). Furthermore, since 
= lt p}fi = 7t E(eS i) eJ j)T ) is a scalar, one can write V(P t {j) ^e { t j) \y 1:t ) = (E[P t {j) \y 1:t ]) 2 V(^e { t j) 

|yi :t ) + (E[*e?V:t]W^^ ensure 
that there is no information loss due to the modified artificial evolution law, one must have, 
V(6 t \t-i\yi*) = V(§ t _ 1 \ t _ 1 \y 1:t ) = V §ti]t _ i , which implies that, C(§ t . u Ct\Yt)+C{P t ^€ t ,Ct\Y t ) = 
-\W t - |P t 2 W 2 /I fT , where C(6^ t _ v P^^e^\y 1:t ) is ignored since it is assumed that when 
the parameter is constant, it is then independent from e\ ' . 

Therefore, negative correlations are needed to remove the effects of the unwanted information 
loss. In case of approximate joint normality of {9^\ t _^ (^\Y t ) and (P t ^\Pep'\ (^\Y t ), the 
conditional normal evolution is obtained as p{Q^\^t-i\t-i) ~ M{A t m^ + (I — A t )fh t -i, (I — 
Af)V§ t i ), where rrSp at each time step is found from equation (13b). The shrinkage matrix 
A t , is obtained as A t = I - [^V^ 1 ^ + P^Vy^V^J]. The upper bound for A t is 

obtained as A t < 7(1 - [ L max( m^ +pf ^PvF^j D' where is the u PP er bound 



n t-i\t-i " tmax " " "t-i|t-i 



September 30, 2014 



DRAFT 



19 

of Pt*\ j — 1, ...,M, and the normalization of the eigenvalue is performed to ensure that the 

ffmin(w«v^ _ +P t 2 max *^* T ic-i _ ) 

related fraction remains less than 1. Let a = 1 — /Tr , ^ii 1 1 — — TT , rrpT *i ' \ 1, therefore, 

e t-i\t-i e t-i\t-i 
the shrinkage matrix becomes A t = al with the shrinkage factor a. The smoothing matrix of 

the normal distribution variance is now obtained from the shrinkage factor as (1 — a 2 ) I, which 
guarantees that the normal distribution of the parameter estimates has a finite variance for all time. 
This completes the proof of the proposition. ■ 
The convergence of the estimated parameter particles O^u^ j = 1,...,M to the local 
minimum of E( J^^uJIy^t-i, x t ) is now investigated in the following subsection. The devel- 
oped convergence proof does not ensure the convergence of the RPE-based parameter estimation 
method to the true parameter value, but only to a set of zeros of the gradient of the chosen cost 
function. 

E. Convergence of the RPE-based Parameter Update Law 

In order to investigate the convergence of our proposed RPE-based modified artificial evolution 
law for updating the parameter particles distribution and to achieve a local minimization of 
E(J(0 t _i)|yi :t _i, x t ), consider equation (13a), where 7 t denotes a time-varying step size such 
that Hindoo 7 t = /i 0 > 0, where /i 0 is a small positive constant. The introduction of the step size 
7t is necessary to transform the discrete-time model (13a) into a continuous-time representation 
as shown subsequently. 

First, we need to state the following three assumptions A1-A3 according to [19], to guarantee 
the convergence of our proposed algorithm as presented in our main result below in Theorem 
1. Specifically, let us assume: 

Al. The vector {x t , 9 t } is defined such that it ranges over a compact set denoted by D M , for 
which the functions f t (x t ,9 t ,uJt) and ht(x t ,9 t ) are continuously differentiable with respect to 
the state x t as well as the parameter 9 t . 

A2. The function l(e(t : 9 t -i\t-i)) is sufficiently smooth and twice continuously differentiable 
w.r.t. e, and |Z ee (e(*,0 t _i| t _i))| < C for § t -i\ t -i e D M , where Z ee (e(t, 0 t -i\t-i)) denotes the 
second derivative of l(e(t,6 t -i\t-i)) w.r.t. e. 

A3. The data generation sequence y t (generated from equation (2)), is such that E(/(e(t, 9 t -i\t-i))) 
= J(9t-!\t-i) and ^ ^-^— l(e(t,0 t -i\t-i))] = -90t-i\t-i) exist for all 9 t -\\t-i e Dm, where 
E(Z(e(MVi| t -i))) = i£\ = t- K m<rA-i\t-i)). 
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It must be noted that the kernel shrinkage method, as stated earlier, attempts to retain the mean 
of the parameter estimation particles at time t near the estimated parameter in the previous time 
step t — 1, i.e. 0 t _i| t _i. Therefore, in the following theorem the convergence properties of 9t-i\t-i 
is now addressed. The main result of this section is stated below. 

Theorem 1. Consider the parameter estimation algorithm as specified by the equations (13a)- 
(13d). Also consider the a posteriori parameter estimate governed by equation (15). Let As- 
sumptions Al to A3 hold. It now follows that the particles 9^\ t _^ j = 1,...,M, and conse- 
quently the distribution of the estimated parameter particles approximated by the particle filter 
Trf ^dflt-i), w.p.l converge either to the set D c = {Ot-iu-M-iu-i G D M, w — M-Vi) 
= 0, j = 1, M} or to the boundary of D M as t — > oo. 

Proof: The existence of the projection algorithm in the parameter estimation scheme ensures 

that remains inside D M . According to equation (15), the a posteriori estimate of the 

parameter at time t is obtained from the resampled particles of the a priori parameter estimate 

0^_ ± , as 9 t \t = jfi Ylf=\ @i\t-v where 0^_ v is selected from the M particles of for which 

pvXvt - K%t\t,Qt\t-i)) nas hl g her probabilities. 

Consider equation (13c) for generating an d substitute from the RPE-based update 
rule of equations (13a)-(13b), to obtain the following expression for the resampled particles 

o$-i> namel y M 

J'=l 

By applying the sum operator to both sides of equation (17) to construct 9 t \ t yields, 

j M M M M M M 

^ E ^ft-i "^mE fl *iiit-i + m E m E e t-i\t-i ^^^EmE 
j=i j=i j=i j=i j=i j=i 

M M 1 M M 

+ A ^H iAM i] <t, e^-J + V 77 ^ E c W) = if E eV-i + A 4 E rfW^, ^11,-1). 

which results in § t \ t = O t -i\t-i + ± Xjii ItR?^?^^ &t-i\t-i)- Assumptions Al and A2 
ensure that the regularity conditions are satisfied according to [19], consequently, a differential 
equation associated with equation (13a) can be obtained by considering that At is a sufficiently 
small number and t, i are specified such that Y^k=t 7^ = Through a change of time-scales 
as t — > t and i — > r + At for a sufficiently small At, and by assuming that § t -i\t-i = 0, R\^ = 
Rfj\ A t = al is a constant matrix, the difference equation for 6^ is now expressed as 

§W ~ #0) + a ArR U) g(9^), (18) 
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where g(9^) = ^Yl\Jt^t^ e (^^t-i\t-i)- Consequently, considering Assumption A3, the dif- 
ferential equation related to the evolution of each single particle is obtained as, 

^ = aR$(r)g(9$(r)) = -aR$ ^)[ , (19) 

CLU 

where the subscript D is used to differentiate the solution of the differential equation (19) from 
the solution of the difference equation (18). Now, the required convergence analysis is reduced 
to showing the properties of the deterministic continuous-time system (19). 

Consider the function L^l^) = E(J(0 t C i ) 1 | t _ 1 ) = ^ Y^L i J0t-i\t-i) representing the 
expectation of a positive definite function through M data points for O^-ift-i' wn i cn is itself a 
positive definite function. Our goal is to evaluate the derivative of this function along the trajecto- 
ries of the system (19). According to Assumption A2, the second derivative of l(e(t, O^-m-i)) * s 
bounded, therefore the summation and derivative operations commute. According to Assumption 

A3 for eV-i e Dm, ^gV)) = s^MV^^-H 1 * = ^^d^J^-^' 
exists and is approximated by —g(9^). Therefore, considering that a > 0, and 11$ (t) is a 
positive scalar for j = 1, M (it represents the trace of a positive definite matrix at time r), 
one gets 

d r ,,sr,->- „. d 



M , M 



where the equality is obtained only for 9 d (t) e -De- Therefore, as t — > oo either fl^u! and 
consequently, it i w.p.l tend to Z)c or to the boundary of .D^vt, where w.p.l is with respect to 
the random variable related to the parameter estimate particles. This completes the proof of the 
theorem. ■ 
It was indicated earlier that the above theorem can only guarantee the boundedness of the 
estimated parameter distribution from the particle filters and not its convergence to the optimal 
distribution. Based on the results of Theorem 1 and Proposition 1 one can ensure that the 
probability density function p(^|^i| t _i) in the particle filter remains bounded, and therefore 
the related kernel Kff(d9 t \9 t -i, x t -\) (in the particle filter) is also bounded. 

Remark 3. Assuming that Remarks 1 and 2 are satisfied, and also by assuming that p(y s \x s ,9 s ) < 
oo, and K x (x s \x s -i,0 s -i) < oo, the boundedness of the parameter estimation transition kernel 
Ke(9 s \9 s ^i,x s ) is also ensured from the Proposition 1 and Theorem 1. Therefore, the convergence 
of the proposed dual state/parameter estimation filter to their optimal values, for {x t , 9t} € Dm 
can be guaranteed according to Theorem 3.1 in [31]. 
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Our proposed dual state and parameter estimation scheme is an effective scheme for the 
purpose of fault diagnosis of nonlinear systems where without loss of any generality one initiates 
operating the system from the healthy mode of operation. During the healthy operation of the 
system our proposed dual state and parameter estimation strategy can provide one with an 
accurate and reliable information on the health parameters of the system. This information 
can then be readily used to perform the tasks of fault detection, isolation and identification, 
following the presence of faults in the actuators, sensors, or the components of the system. 
In the following subsection, the application of our proposed approach for addressing the fault 
diagnosis of nonlinear systems is investigated. 

F. The Fault Diagnosis Formulation 

Determination and diagnosis of drifts in unmeasurable parameters of a system require an on- 
line parameter estimation scheme. In parametric modeling of a system anomaly or drift, generally 
it is assumed that the parameters are either constant or dependent on only the system states [34]. 
Hence, drifts in the parameters must be estimated through estimation techniques. 

In [35], various parameter estimation methods such as least squares, instrumental variables and 
estimation via discrete-time models have been surveyed. The main drawbacks with such methods 
arise due to the complexity and nonlinearity of the system that make parameter estimation 
schemes a nonlinear optimization problem that must be solved in real-time. In [36] a nonlinear 
least squares (NLS) optimization scheme is developed for fault identification of a hybrid system. 

The fault diagnosis problem under consideration in this work deals with obtaining an optimal 
estimate of the states as well as the time-varying parameters of a nonlinear system whose 
dynamics is governed by the discrete-time stochastic model, 

x t +i = ft(x t , 9j\(x t ),u t ), (20) 

y t = h t (x t ,9j\(x t )) + is t , (21) 

where f t : R nx x W 10 x R nw — y R Hx is a known nonlinear function, 0 t e R ne is the unknown and 
possibly time-varying parameter vector that for a healthy system is set to 1, X t : W lx — y W 10 
is a differentiable function that determines the relationship between the system states and the 
health parameters. The function h t : R nx x M. ne — y R ny is a known nonlinear function, ui t and 
v t are uncorrelated noise sequences with covariance matrices Q t and R t , respectively. According 
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to the formulation that is used in equations (20) and (21), the parameters 9 t to be estimated are 
considered as a multiplicative fault vector whose value is taken to be equal to 1 in the healthy 
mode of the system operation. 

The model (20) and (21) is now used to investigate the problem of fault diagnosis (FD), which 
in this work implies and signifies the problem of fault detection, isolation, and identification 
(FDI) when the system health parameters are considered to be affected by a multiplicative fault 
vector 9 t . The system health parameters are known functions of the system states, X(x t ), and the 
multiplicative fault vector 9 t which is to be estimated. In other words, the estimated parameter 
9 t \t will be used to generate residual signals for accomplishing the fault diagnosis goal and 
objective. The required residuals are obtained as the difference between the estimated values of 
the parameters in the healthy operational mode and the faulty operational mode of the system, 
that is 

n = e 0 - e t \ t , (22) 

where 9 0 denotes the estimated value of the parameter in the healthy operational mode and Q t \t 
denotes its a posteriori estimated value after the occurrence of the fault. It should be pointed out 
that the true value of the parameter is denoted by which is clearly not available in reality. 

In parameter estimation methods used for fault diagnosis of the system components, the 
residuals can be generated by comparing the estimated parameters that are obtained by either the 
ordinary least squares (OLS) or the recursive least squares (RLS) algorithms with the parameters 
that are obtained from estimation filters in the fault free case [37]. In the implementation of our 
proposed fault diagnosis strategy based on our developed state/parameter estimation framework, 
the parameter estimates will be considered as the main indicator to be used for detecting, 
isolating, and identifying the faults in the system components. The residuals are generated from 
the parameter estimates under the healthy and faulty operational modes of the system according 
to equation (22) as follows, 

9 0 = argmax(-log(p(0 o |j/i:T)), (23) 

where 9 0 denotes the estimated parameters with the probability density p(9 0 \yo-.T) (conditioned on 
the observations up to time T), that are obtained from the collected estimates under the healthy 
operational mode of the system and fitted to a normal distribution. The time window T is 
chosen according to the convergence time of the parameter estimation algorithm. The thresholds 
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to indicate the confidence intervals for each parameter are obtained from Monte Carlo analysis 
that is performed under different single-fault and multi-fault scenarios. The estimated parameters 
9 t \t are generated by following the procedure that was developed and proposed in the previous 
sections of this work. 

IV. Fault Diagnosis of a Gas Turbine Engine 

In this section, the utility of our proposed dual estimation framework when applied to the 
problem of fault diagnosis of a nonlinear model of a gas turbine engine is demonstrated and 
investigated. 

A. Model Overview 

The utility and application of our proposed PF method for state/parameter estimation in a 
gas turbine engine is presented in the next subsections. The performance of our proposed 
state/parameter estimation scheme is evaluated and investigated subject to deficiencies in the 
system health parameters due to injection of simultaneous faults. 

The mathematical model of a gas turbine that is used in this paper is a single spool jet 
engine as developed in [29]. The four engine states are the combustion chamber pressure and 
temperature, P cc and T cc , respectively, the spool speed N, and the nozzle outlet pressure Pnlt- 
The continuous-time state space model of the gas turbine is given as follows, 
T C c = — l —[{c p Tc6 mc ihc + rjccHnmi - c p T C cO mT m T ) - c v Tcc(O mc rh c + m { - 9 mT m T )}, 



N = 



C v 7Tl cc 

Vm C chO mT m T Cp(Tcc ~ T T ) - 9 mc m c c p (Tc - T d ) 



Pcc 1 



Pcc = 7f : — [{c p T c O mc mc + i]ccH n mi - c p Tc C 0m T m T ) - c v T C c{O mc mc +m f - 6 mT m T )} 

l cc c v m cc 

'yRT'cc 

+ —TT (0 mc m c + m f - 9 mT m T ) : 

Vcc 

Pnlt = T^(O mT m T + -——9 mc mc - mizzle), (24) 
Vm p + l 

For the physical significance of the model parameters and details refer to [29]. The five 
gas turbine measured outputs are considered to be the output pressure and temperature of the 
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compressor, the turbine, and the spool speed, namely 




1]], 



y 2 {t) = Pec, ys(t) = N, y 4 (t) = Pnlt, 



(25) 



In order to discretize the above model for implementation of our proposed dual state/parameter 
estimation particle filters, a simple Euler Backward method is applied with the sampling period 

of T s = 10 msec. 

The system health parameters are represented by the compressor and the turbine efficiency, 
r] C and rj T , respectively, and the mass flow capacities, m c and m T , respectively. A fault vector 
is incorporated in the above model to represent the effects of the system health parameters 
which are denoted by 9 = [9 Vc ,9 mc , 9 r)T ,9 mT ] T . Each parameter variation can be a manifestation 
of changes in the fault vector and which is considered as a multiplicative fault type. All the 
simulations are conducted in the cruise steady-state flight condition mode in which two of the 
system health parameters are considered to be slowly time-varying. 

In order to demonstrate the effectiveness and capabilities of our proposed algorithms, the 
simulation results corresponding to the conventional kernel smoothing method [17] (that is 
obtained without using the RPE term in the parameter estimation algorithm) as well as a method 
that uses stochastic optimization with the step size as a decreasing series of time [16] are also 
provided for performance comparisons. In our work, the adaptive step size (P^ = 7t-R^) is 
defined as the multiplication of a constant ^ t with R\ ; which is evaluated on-line based on the 
trace of the covariance matrix of the prediction error which is estimated from the maximum 
likelihood method. The residuals corresponding to the parameter estimates are obtained. Based 
on the percentage of the maximum absolute error criterion, a convergence time of 2 seconds is 
obtained in the simulation results for estimating both the states and the parameters corresponding 
to 25 runs of simultaneous faults with severities ranging from 1% to 10%. 

To choose the number of particles for implementation of the state and parameter estima- 
tion filters, a quantitative study was conducted. Specifically, based on the mean absolute error 
(MAE%) obtained in the estimation steady state operation and by considering the algorithm 
implementation computational time, the number of particles is chosen as N = M = 50 for both 
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state and parameter estimation filters in this application. Consequently, acceptable performance 
and convergence times are obtained. The shrinkage coefficient is also selected as 0.93. The 
initial distributions (mean and covariance matrices) of the states and parameters are selected to 
correspond to the cruise flight operational conditions as provided in [29]. 

The input fuel flow to the engine is changed (that is, it is decreased by 2%) one second 
after reaching the steady state condition. Afterwards, a simultaneous fault in all the 4 health 
parameters of the engine is applied at t = 9 sec. The fault for the compressor and the turbine 
efficiencies follow the pattern of a drift fault which starts at t = 9 sec and causes a 5% loss of 
effectiveness in the compressor efficiency by the end of the simulation time (i.e. at t — 19 sec), 
and a 3% loss of effectiveness in the turbine efficiency by the end time of the simulation time. 
Simultaneously, the mass flow capacities of both the compressor and the turbine are affected by 
a fault which causes a 5% loss of effectiveness. 

The dual estimation results corresponding to the above three methods are provided in Figures 
1 and 2 for the estimates of the states and health parameters, respectively. The presented 
simulation results show that our proposed method in terms of its convergence time and the 
MAE% outperforms the other two methods significantly. 

For the method based on the stochastic optimization with a decreasing step size, the step 
size after extensive trial and error is chosen as ( . fc+ ° 3 ^ 0 5 602 for this application, which yields a 
competition with our proposed adaptive step size method. However, it should be emphasized 
that one is generally interested in implementing an algorithm that is capable of tracking the 
parameter changes under all conditions and without the need for extensive fine tuning of the 
step size in each scenario and for each specific application. Nevertheless, the accuracy of the 
algorithm in [16] after fault occurrence, specifically in the case of slow time- varying changes in 
the system health parameters, degrades considerably. 

The residuals related to the compressor mass flow faults and turbine efficiency faults are shown 
in Figure 3. These results show that in case of changes in the engine input (at t — 1 sec) the 
selected residuals do not exceed their confidence bounds, and the results in Table 1(a) show that 
the maximum MAE% for both the state and parameter estimates in this scenario are between 
0.1% — 0.5% of their nominal values. However, in the worst case the post fault estimated MAE 
of ttit is 0.47% of its nominal value. The percentage of the MAE measurement (output) estimate 
error as given in Table 1(b) shows that after simultaneous fault occurrences the error increases 
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Fig. 1. Estimated states for the simultaneous faults using RPE-based (this work), Decreasing step size [16], and Conventional 
kernel smoothing [17] approaches. 

TABLE I 

(a)State/Parameter, (b)Output estimation percentage of MAEs in case of simultaneous fault scenarios 





(a) 






(b) 




State 


Before Fault 


After Fault 




Output 


Before Fault 


After Fault 
















•Pec 


0.2217 


0.2372 




T C 


0.2207 


0.2852 


N 


0.0535 


0.1061 




Pc 


1.2926 


1.3729 


T CC 


0.2086 


0.1928 




N 


0.0535 


0.1027 


^NLT 


0.3970 


0.4291 




T T 


0.1565 


0.1650 


VC 


0.1735 


0.1821 




P T 


2.1210 


2.2348 


m c 


0.2811 


0.3293 








VT 


0.1016 


0.1485 










0.4589 


0.4744 









when compared to their values before the fault occurrence. This is caused due to the accumulation 
of parameter estimation errors while all the four parameters are affected by a fault. 

To summarize, our proposed fault diagnosis algorithm is capable of detecting, isolating and 
estimating the component faults of a gas turbine engine with an accuracy of 0.3% for the 
compressor and 0.5% for the turbine faults. 

Fault Diagnosis Confusion Matrix Analysis 

Finally, in this subsection a quantitative study is performed by utilizing the confusion matrix 
analysis [38] to evaluate the increase in the rate of the false alarms and/or misclassifications 
of faults in our considered application when the detection thresholds are fixed at the values 
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Fig. 2. Estimated health parameters for the simultaneous faults using RPE-based (this work), Decreasing step size [16], and 
Conventional kernel smoothing [17] approaches. 
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Fig. 3. Residuals related to the simultaneous fault scenarios. 



that are obtained from the 25 runs of the simultaneous fault scenarios shown in Figure 3 (for 
the compressor mass flow fault and the turbine efficiency fault parameters). The confusion 
matrix data is obtained by performing Monte Carlo simulations for another 35 simultaneous 
fault scenarios with different fault severities and in presence of process and measurement noise 
having 4 different covariance levels ranging from 50% to 120% of the nominal values of the 
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TABLE II 

Confusion matrix for (a) 50% of the nominal noise covariance, (b) 120% of the nominal noise covariance 



(a) 



(b) 



Fault 



VC 
m c 

VT 
NoFault 



31 

0 
1 
1 

0 



0 
30 



28 

3 
I 



4 

29 

1 



0 
0 

1 
1 

33 



IT 
mi* 
NoFault 



29 
0 
1 
1 

0 



0 
29 
1 
1 

0 



2 
3 
27 

4 

2 



4 
3 
3 
26 

3 



0 
0 
3 
3 
30 



process and measurement noise covariances (according to [29]). In these scenarios, at each time 
more than one of the system health parameters are affected by component faults. 

The results are presented in Tables 11(a) and 11(b) in which the rows depict the actual number 
of fault categories applied and the columns represent the number of estimated fault categories. 
The diagonal elements represent the true positive rate (TP) for each fault occurrence case. The 
accuracy (AC), precision (P) and the false positive rate (FP) of our proposed algorithm are also 
evaluated from the results of the confusion matrix analysis according to the following formulae 
[38], 



AC 



£ 7 =i c 



FP 



Ej=l C 5j 
L 7 = l C 5j 



where q-,, i,j = 1, 5 denote the elements of the confusion matrix. In Table III the results of 
the confusion matrix analysis according to the above metrics for both Tables 11(a) and 11(b) are 
provided. The results show that as the noise covariances increase from 50% to 120% of their 
nominal values, the accuracy of the fault diagnosis algorithm decreases by 5.72%, and the false 
positive rate increases by 8.58%. The precision of the algorithm remains robust for diagnosis of 
the compressor health parameters, the precision decreases by 6.73% for the fault diagnosis of 
the turbine efficiency and also decreases by 7.69% for the fault diagnosis of the turbine mass 
flow capacity. 

TABLE III 

Confusion matrix Analysis results 



Noise Level 


AC% 


FP% 


P vc % 


Pm c % 




Pm T % 


50% of Nom. Cov. 
120% of Nom. Cov. 


86.29 
80.57 


5.71 
14.29 


93.94 
93.55 


93.75 
93.55 


77.78 
71.05 


74.36 
66.67 
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V. Conclusion 

In this paper, a novel dual estimation filtering scheme is proposed and developed based on par- 
ticle filters to estimate a nonlinear stochastic system states and time variations in its parameters. A 
dual structure is proposed for achieving simultaneous state and parameter estimation objectives. 
Performance results of the application of our method to a gas turbine engine under healthy 
and faulty scenarios are also presented to demonstrate and illustrate the superior capability and 
performance of our scheme for a challenging fault diagnostic application as compared to two 
other alternative techniques that are available in the literature. The estimation results accuracy 
in terms of the fault identification are also provided. The obtained results are validated by 
performing a confusion matrix analysis. 
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